Pectobacterium araliae sp. nov., a pathogen causing bacterial soft rot of Japanese angelica tree in Japan

Abstract Phytopathogenic bacteria (MAFF 302110T and MAFF 302107) were isolated from lesions on Japanese angelica trees affected by bacterial soft rot in Yamanashi Prefecture, Japan. The strains were Gram-reaction-negative, facultatively anaerobic, motile with peritrichous flagella, rod-shaped, and non-spore-forming. The genomic DNA G+C content was 51.1 mol % and the predominant cellular fatty acids included summed feature 3 (C16 : 1 ω7c and/or C16 : 1 ω6c), C16 : 0, summed feature 8 (C18 : 1 ω7c and/or C18 : 1 ω6c), summed feature 2 (comprising any combination of C12 : 0 aldehyde, an unknown fatty acid with an equivalent chain length of 10.928, C16 : 1 iso I, and C14 : 0 3OH), and C12 : 0. Phylogenetic analyses based on 16S rRNA and gyrB gene sequences, along with phylogenomic analysis utilizing whole-genome sequences, consistently placed these strains within the genus Pectobacterium. However, their phylogenetic positions did not align with any known species within the genus. Comparative studies involving average nucleotide identity and digital DNA–DNA hybridization with the closely related species indicated values below the thresholds employed for the prokaryotic species delineation (95–96 % and 70 %, respectively), with the highest values observed for Pectobacterium polonicum DPMP315T (92.10 and 47.1 %, respectively). Phenotypic characteristics, cellular fatty acid composition, and a repertoire of secretion systems could differentiate the strains from their closest relatives. The phenotypic, chemotaxonomic, and genotypic data obtained in this study show that MAFF 302110T/MAFF 302107 represent a novel species of the genus Pectobacterium, for which we propose the name Pectobacterium araliae sp. nov., designating MAFF 302110T (=ICMP 25161T) as the type strain.


INTRODUCTION
The genus Pectobacterium, belonging to the family Pectobacteriaceae of the order Enterobacterales, encompasses a diverse group of Gram-negative, rod-shaped, pectinolytic bacteria responsible for soft rot diseases in a wide array of plant hosts [1–6]. These bacteria are characterized by their ability to secrete a vast repertoire of pectinolytic and cellulolytic enzymes that break down plant cell walls, leading to the characteristic soft, mushy lesions associated with soft rot diseases. They have a global distribution and exert a significant impact on agriculture, causing substantial economic losses because of reduced yields and crop production [7,8]. Whole-genome sequencing technologies have driven recent revisions to the taxonomy of the order Enterobacterales, including the genus Pectobacterium, leading to reclassifications and the recognition of new species [2–6,9]. At the time of writing this paper, 20 species with validly published and correct names existed in the genus Pectobacterium according to the List of Prokaryotic names with Standing in Nomenclature (https://lpsn.dsmz.de) [10].
The Japanese angelica tree, or Japanese aralia (Aralia elata), is a deciduous shrub of the family Araliaceae that is distributed in Japan and eastern Asia. In Japan, it has long been customary to eat its new shoots, and the trees have been cultivated extensively, especially since the 1980s. In 1987, soft rot symptoms were observed in the new shoots and stems of Japanese angelica trees cultivated in Yamanashi Prefecture, Japan (Fig. S1A, available in the online version of this article). The affected shoots turned blackish brown, the stems turned brown and decayed, and the decaying parts eventually died. In some cases, soft rot symptoms also occurred in the roots of intensely affected trees [11].
Ono et al. [11] isolated a causative bacterium from the diseased tissues, confirmed its pathogenicity in Japanese angelica trees (Fig. S1B), and investigated its phenotypic characteristics. Based on these results, the causative bacterium was identified as Erwinia carotovora subsp. carotovora and the disease was named 'bacterial soft rot of Japanese angelica tree' [12]. However, following subsequent major changes in the classification of the order Enterobacterales [1,6,9], which also includes the genus Erwinia, a preliminary re-examination of the taxonomic affiliation of the causative bacterium deposited in the Genebank Project (www.gene.affrc.go.jp/databases-micro_search_en.php) of the National Agriculture and Food Research Organization (NARO), Japan, was performed. The results suggested that the causative bacterium may not belong to the genus Erwinia, but to a new species of the genus Pectobacterium.
The objective of this study was to clarify the taxonomic affiliation of MAFF 302110T (=ICMP 25161T) and MAFF 302107 (=ICMP 25162), both of which are pathogens causing soft rot disease of Japanese angelica tree that were identified as E. carotovora and deposited in the NARO Genebank, Japan, by Ono et al. [11]. To this end, an extensive comparative analysis of these strains in conjunction with related species was conducted using a polyphasic approach. Our findings indicate that they are members of the genus Pectobacterium and represent a novel species that we propose to name Pectobacterium araliae sp. nov.

GENOME FEATURES
For whole-genome sequencing, MAFF 302110T and MAFF 302107 were cultured in Luria-Bertani broth for 18 h at 30 °C with shaking, and their total genomic DNA was extracted using a NucleoSpin Tissue Kit (Takara Bio) following the manufacturer's protocol. The complete genome sequence of MAFF 302110T was obtained on a PacBio Revio (Pacific Biosciences) with libraries prepared using a SMRTbell gDNA Sample Amplification Kit and SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences), through Seibutsu Giken Inc. (Kanagawa, Japan). Sequencing reads were assembled using Canu version 1.9 [13] and Flye version 2.9.2 [14]. To obtain the draft genome sequence of MAFF 302107, library construction and sequencing on the HiSeq X Ten platform (Illumina) were performed by Eurofins Genomics Inc. (Tokyo, Japan). Adapter-trimmed raw reads were assembled using SPAdes version 3.13.0 [15]. The assemblies obtained (Table S1) were annotated using the DDBJ Fast Annotation and Submission Tool (DFAST; https://dfast.ddbj.nig.ac.jp), which is a prokaryotic genome annotation pipeline [16]. Briefly, protein-coding sequences were predicted using MetaGeneAnnotator [17]. Genes encoding tRNA and rRNA were identified using Aragorn version 1.2.38 [18] and Barrnap version 0.8 (https://github.com/tseemann/barrnap), respectively.
The genome of MAFF 302110T included one circular chromosome but no indigenous plasmids. The chromosome was 4 663 678 bp in size, with a G+C content of 51.1 mol%, and contained 4124 protein-coding sequences, 22 rRNA genes, and 76 tRNA genes. The G+C content was within the range of 50.5–56.1 mol% reported for Pectobacterium species [19]. The genome sequencing data obtained for MAFF 302110T and MAFF 302107 were deposited in DDBJ/ENA/GenBank under accession numbers AP028908 and BRCR00000000, respectively (Table S1).
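The G+C content reported above is a simple base-composition ratio. As a minimal sketch (not the annotation pipeline's own implementation), the following computes it for a toy sequence and checks whether a value lies within the 50.5–56.1 mol% range cited for the genus. The function names and the toy sequence are hypothetical.

```python
# Minimal sketch: G+C content as a percentage of unambiguous bases.
# The toy sequence below is hypothetical, not taken from any genome.

def gc_content(sequence: str) -> float:
    """Return the G+C content (mol%) over A, C, G and T only."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def within_genus_range(gc_percent: float, low: float = 50.5, high: float = 56.1) -> bool:
    """Check a value against the range reported for Pectobacterium species."""
    return low <= gc_percent <= high


toy = "ATGCGCGCTA"  # 6 of 10 bases are G or C
print(f"{gc_content(toy):.1f} mol%")
print(within_genus_range(51.1))  # value reported for the MAFF 302110T chromosome
```

Ambiguous bases such as N are excluded from the denominator here; other tools may treat them differently, so small discrepancies are possible.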
To taxonomically evaluate the relationship between MAFF 302110T and MAFF 302107, digital DNA-DNA hybridization (dDDH) analysis was performed using their genome sequences. For this purpose, formula 2 of the Genome-to-Genome Distance Calculator 3.0 (GGDC 3.0; https://ggdc.dsmz.de/ggdc.php) [20,21] was used. A value of 98.8 % was obtained between the two strains, which is well above the threshold (70 %) for the delineation of prokaryotic species [22], indicating that they belong to the same species.
To preliminarily assess the taxonomic affiliation of MAFF 302110T by genome analysis, the Type Strain Genome Server (TYGS; https://tygs.dsmz.de) [23] and Taxonomy Check implemented in DFAST [16] were used. The MAFF 302110T genome sequence was uploaded to TYGS and the dDDH indices were calculated against the type strain genomes to identify the closest type strain. The results showed that although the dDDH value calculated for Pectobacterium polonicum DPMP315T was the highest (47.1 %; Table S2), it was below the cut-off value for prokaryotic species delineation [22]. In a taxonomy check using the fast average nucleotide identity (FastANI) algorithm [24] (Table S3), the closest strain was P. polonicum DPMP315T, but its value (92.224 % ANI) was also lower than the threshold (95–96 % ANI) for prokaryotic species delineation [22]. These results suggest that MAFF 302110T and MAFF 302107 may represent a novel species of the genus Pectobacterium.
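The threshold comparisons in the two paragraphs above can be expressed as a short check. This is only an illustrative sketch using the values quoted in the text. The function name is hypothetical, and the ANI cut-off is taken as the lower bound (95 %) of the 95–96 % range.

```python
# Illustrative sketch of the species-delineation thresholds cited in the text.
from typing import Optional

DDDH_THRESHOLD = 70.0       # % dDDH
ANI_LOWER_THRESHOLD = 95.0  # % ANI, lower bound of the 95-96 % range


def supports_same_species(dddh: Optional[float] = None,
                          ani: Optional[float] = None) -> bool:
    """Return True only if every supplied index meets its threshold."""
    checks = []
    if dddh is not None:
        checks.append(dddh >= DDDH_THRESHOLD)
    if ani is not None:
        checks.append(ani >= ANI_LOWER_THRESHOLD)
    if not checks:
        raise ValueError("at least one of dddh or ani must be given")
    return all(checks)


# MAFF 302110T vs MAFF 302107 (dDDH reported in the text)
print(supports_same_species(dddh=98.8))              # True
# MAFF 302110T vs P. polonicum DPMP315T (dDDH and FastANI reported in the text)
print(supports_same_species(dddh=47.1, ani=92.224))  # False
```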

16S rRNA GENE ANALYSES
To comprehensively identify the bacterial species closely related to MAFF 302110T/MAFF 302107, a homology search based on 16S rRNA gene sequences was performed as described previously [25,26]. Briefly, their partial sequences were determined by direct sequencing of the PCR products, and a homology search was conducted in the EzBioCloud database (www.ezbiocloud.net/identify) [27] using the sequence of MAFF 302110T as a query. A total of 43 known species showing high similarity to MAFF 302110T were identified, and their 16S rRNA gene sequences were collected from the EzBioCloud database. Next, the 16S rRNA gene sequences of the species that exhibited high dDDH/ANI values with MAFF 302110T in the aforementioned genome analyses performed using TYGS and DFAST (Tables S2 and S3) were retrieved from GenBank. Using a pairwise nucleotide sequence alignment tool (www.ezbiocloud.net/tools/pairAlign) [27] in EzBioCloud, the similarity values against the 16S rRNA gene sequence of MAFF 302110T were calculated. Finally, by combining the data derived from EzBioCloud, TYGS, and DFAST and eliminating duplicates, 45 known species with validly published and correct names were selected as closely related to MAFF 302110T/MAFF 302107 (Table S4). Among these, Pectobacterium quasiaquaticum A477-S1-J17T showed the highest sequence similarity (99.36 %) to MAFF 302110T.
To determine the phylogenetic position of MAFF 302110T/MAFF 302107, phylogenetic analyses based on 16S rRNA gene sequences were carried out as described in our previous studies [25,26], targeting MAFF 302110T/MAFF 302107, the 45 species selected as closely related to MAFF 302110T/MAFF 302107 (Table S4), and Budvicia aquatica DSM 5075T (selected as an outgroup based on the analysis result of Adeolu et al. [9]). The 16S rRNA gene sequences were analysed using MEGA 11 version 11.0.13 [28] with the neighbour-joining, maximum-likelihood, and maximum-parsimony methods. The reliability of the trees was tested using the standard bootstrap method with 1000 replications. In the resulting phylogenetic trees, MAFF 302110T and MAFF 302107 clustered as a monophyletic clade with a bootstrap value of 99 % (Fig. S2). However, the phylogenetic position of this clade did not match any of the known species used in the analyses.
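Pairwise 16S rRNA gene similarity values such as the 99.36 % quoted above are percentages of identical positions in a pairwise alignment. The sketch below assumes two sequences that have already been aligned, with '-' marking gaps, and ignores gapped positions. It is not the EzBioCloud algorithm, and the function name and toy sequences are hypothetical.

```python
# Minimal sketch: percent identity between two pre-aligned sequences.
# Gap handling (gapped columns are skipped) is an assumption of this example.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical bases over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): one mismatch in six compared columns
print(f"{percent_identity('ACGTAA', 'ACGTTA'):.2f} %")
```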
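The standard bootstrap mentioned above resamples alignment columns with replacement and rebuilds the tree for each replicate. A minimal sketch of the resampling step alone follows. The tree-building step is omitted, and the function name and toy alignment are hypothetical rather than taken from MEGA.

```python
# Minimal sketch of the column-resampling step of a standard (non-parametric)
# bootstrap. Tree inference on each replicate is not shown.
import random
from typing import Dict


def bootstrap_alignment(alignment: Dict[str, str], rng: random.Random) -> Dict[str, str]:
    """Return one pseudo-replicate: columns drawn with replacement, same for all taxa."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences)
toy = {"strain_1": "ACGTACGT", "strain_2": "ACGTACGA", "outgroup": "TCGAACTT"}
rng = random.Random(42)  # fixed seed so the example is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), len(replicates[0]["strain_1"]))
```

The bootstrap value of a clade is then the percentage of replicate trees in which that clade is recovered.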

GYRB GENE ANALYSES
The phylogenetic position of MAFF 302110T/MAFF 302107 was evaluated using the gyrB gene sequences of the same members as those used in the 16S rRNA gene sequence analyses (Fig. S2). The partial sequences of MAFF 302110T/MAFF 302107 were determined by direct PCR sequencing using the primers described by Brady et al. [29], whereas those of the 45 related species (Table S4) and Budvicia aquatica DSM 5075T (outgroup) were extracted from the respective whole genome sequences. Phylogenetic analyses were performed using these sequences, as described previously [30]. The resulting phylogenetic trees revealed that MAFF 302110T and MAFF 302107 were tightly clustered (Fig. S3), which was similar to the results of the 16S rRNA gene sequence analyses (Fig. S2). Furthermore, these two strains were found to cluster together with the known species of the genus Pectobacterium as a monophyletic clade with a standard bootstrap value of 100 %. However, the phylogenetic position of MAFF 302110T/MAFF 302107 was not consistent with that of any member of the genus.
Table 1 presents the ANI and dDDH values obtained for these species arranged in descending order based on their ANIb values calculated against the MAFF 302110T genome sequence. The ANIb values between MAFF 302110T and its closely related species ranged from 77.91 % (in the case of Huaxiibacter chinensis 155047T) to 91.74 % (in the case of P. polonicum DPMP315T), while the OrthoANIu values varied between 72.98 % (for H. chinensis 155047T) and 92.10 % (for P. polonicum DPMP315T). Notably, both ANI values were below the established cut-off for prokaryotic species delineation [22]. Additionally, the dDDH values ranged from 20.0 % (for Enterobacter soli LMG 25861T) to 47.1 % (for P. polonicum DPMP315T), which fell below the recognized threshold for prokaryotic species delineation [22].
Table 1. Genomic relationship between strain MAFF 302110T and type strains of closely related species. The ANIb, OrthoANIu, and dDDH values were calculated using the genome-based distance matrix calculator [31], the ANI calculator [32], and the Genome-to-Genome Distance Calculator 3.0 (formula 2) [20,21], respectively. The data shown here have been presented in descending order of their ANIb values calculated against the MAFF 302110T genome sequence.
To further elucidate the phylogenetic placement of MAFF 302110T/MAFF 302107, a comprehensive phylogenomic analysis was performed based on the concatenated alignment of core genes that constitute the core genome of MAFF 302110T/MAFF 302107, the 45 closely related species (Table 1), and Budvicia aquatica DSM 5075T (an outgroup). Genomic annotations were generated de novo using Prokka [33] and the annotated genes were compared across all input genomes using the Roary pan-genome analysis pipeline [34], applying a 70 % amino acid identity threshold. This analysis identified 807 core genes that were present in all the genomes under investigation. Subsequently, the nucleotide sequences of these core genes were concatenated, and multiple alignments were conducted using MAFFT [35] implemented through Roary. To enhance the quality of the concatenated alignment, poorly aligned positions and divergent regions were removed using Gblocks [36] with default parameters. From the resulting alignment, which had a total length of 639 079 bp, a maximum-likelihood tree was reconstructed using RAxML-NG [37], with a general time-reversible substitution model and a gamma model of rate heterogeneity. To evaluate the reliability of the tree, the standard bootstrap method with 100 replicates was used. The resulting phylogenomic tree positioned MAFF 302110T/MAFF 302107 within the clade of the genus Pectobacterium (Fig. 1). However, it is important to note that the phylogenetic placement of MAFF 302110T/MAFF 302107 did not correspond to that of any other member of the genus.
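The core-gene selection and concatenation steps of the phylogenomic analysis described above can be sketched conceptually as follows. This does not reproduce Roary's clustering or its output format. The data structures, function names, and toy gene sets are all hypothetical, and per-gene alignments are assumed to exist already.

```python
# Conceptual sketch: core genes as the intersection of per-genome gene-cluster
# sets, followed by concatenation of per-gene alignments.
from typing import Dict, List, Set


def core_genes(gene_sets: Dict[str, Set[str]]) -> List[str]:
    """Return, in sorted order, the gene clusters present in every genome."""
    if not gene_sets:
        return []
    shared = set.intersection(*(set(genes) for genes in gene_sets.values()))
    return sorted(shared)


def concatenate(alignments: Dict[str, Dict[str, str]], genes: List[str]) -> Dict[str, str]:
    """Join aligned sequences gene by gene (alignments[gene][genome]) for each genome."""
    if not genes:
        return {}
    genomes = sorted(alignments[genes[0]])
    return {g: "".join(alignments[gene][g] for gene in genes) for g in genomes}


# Toy input (hypothetical cluster names and sequences)
gene_sets = {
    "genome_A": {"gyrB", "recA", "pelA"},
    "genome_B": {"gyrB", "recA"},
    "genome_C": {"gyrB", "recA", "hrpN"},
}
alignments = {
    "gyrB": {"genome_A": "ATGA", "genome_B": "ATGC", "genome_C": "ATG-"},
    "recA": {"genome_A": "GGC", "genome_B": "GGC", "genome_C": "GAC"},
}
core = core_genes(gene_sets)
print(core)
print(concatenate(alignments, core))
```

Sorting the core genes fixes a single gene order, so every genome contributes its sequences to the same columns of the concatenated alignment.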

PHYSIOLOGY AND CHEMOTAXONOMY
The phenotypic characteristics of MAFF 302110T/MAFF 302107 were compared with those of four Pectobacterium species (P. polonicum, P. parmentieri, P. punjabense, and P. wasabiae) that were closely related to MAFF 302110T/MAFF 302107 in the phylogenomic analysis (Fig. 1). The data for the latter four species were retrieved from Waleron et al. [4,5] and Khayi et al. [2] for comparison. MAFF 302110T/MAFF 302107 were cultured routinely at 28 °C on standard methods agar (plate count agar) plates (Nissui) and tested for the phenotypic characteristics, as described below. Cell size, morphology, and flagellar insertion were determined using transmission electron microscopy as described previously [25]. Colony morphology and growth at 37 °C were evaluated using TY (tryptone-yeast extract) agar plates [5 g l−1 tryptone, 3 g l−1 yeast extract, and 1.5 % (w/v) agar] [2]. Gram reaction (Ryu non-staining KOH method), motility, oxidase activity, and potato soft rot were examined as described by Schaad et al. [38]. Catalase activity was determined as described by Lelliott et al. [39]. Carbon source utilization and chemical sensitivity were evaluated using the Biolog GEN III MicroPlate system (Biolog) according to the manufacturer's protocol with the following modifications [25,40]. The results of the Biolog assays were scored after 3 days of incubation at 28 °C. The Biolog assays were repeated thrice, and only the stable results are shown in Table 2 and the species description below.

FUNCTIONAL GENOMICS
Functional and metabolic reconstruction of the genome of MAFF 302110T was performed using BlastKOALA and KEGG Mapper from the Kyoto Encyclopedia of Genes and Genomes (KEGG; www.kegg.jp/kegg/) [42]. The KEGG annotation predicted that the MAFF 302110T genome encoded 87 complete pathway modules that were organized into the following functional subcategories in the metabolism category: carbohydrate metabolism, energy metabolism, lipid metabolism, nucleotide metabolism, amino acid metabolism, glycan metabolism, metabolism of cofactors and vitamins, biosynthesis of terpenoids and polyketides, and biosynthesis of other secondary metabolites. The carbohydrate metabolism genes (338 genes) were the most abundant in the genome (Table S5). Genes encoding flagellar biosynthesis and assembly proteins, including flgABCDEFGHIJKLMNYZ, flhABCDE, fliACDEFGHIJKLMNOPQRST, and motAB, were predicted to be present in the genome, which is in accordance with the observation that MAFF 302110T was motile with peritrichous flagella (Fig. S4). Additionally, numerous genes potentially involved in membrane transport (including ABC transporters, phosphotransferase systems, and bacterial secretion systems; 251 genes), signal transduction (including two-component systems; 131), cellular community (including quorum sensing and biofilm formation; 126), and bacterial chemotaxis (19) were detected (Table S5), suggesting that MAFF 302110T may be sensitive to various environmental stimuli.
Several factors, some of which are given below, may be involved in the pathogenicity of Pectobacterium species to plants [43–46]: two-component signal transduction systems, quorum sensing, secretion systems, carbohydrate-active enzymes (CAZymes) including plant cell wall-degrading enzymes, toxins, and siderophores. Of these, CAZymes and secretion systems are considered to be key virulence factors [47,48]. Therefore, MAFF 302110T and the type strains of the closely related Pectobacterium species were compared based on these two factors. Based on their catalytic activity and amino acid sequence similarity, CAZymes are divided into several classes, including auxiliary activities (AAs), carbohydrate-binding modules (CBMs), carbohydrate esterases (CEs), glycoside hydrolases (GHs), glycosyl transferases (GTs), and polysaccharide lyases (PLs). To assess the number of domains encoding putative CAZymes for each of the above classes, genome mining was performed using the dbCAN3 meta-server (https://bcb.unl.edu/dbCAN2/index.php) [49]. If an annotation was supported by at least two of the following tools/databases in dbCAN3: DIAMOND (CAZy), HMMER (dbCAN), or HMMER (dbCAN-sub), the hit was assigned to the corresponding CAZyme class. For the secretion systems, the total number of genes assigned to the core components of each system by KEGG annotation was calculated for each strain and compared among the strains.
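The two-of-three support rule described above can be illustrated with a short tally. The record layout below (one dictionary per hit with a family name and one flag per tool) is a hypothetical simplification, not the dbCAN3 output format, and the example hits are invented for illustration.

```python
# Illustrative sketch of the "supported by at least two tools" rule, followed
# by a per-class tally. Record layout and example hits are hypothetical.
import re
from collections import Counter
from typing import Dict, Iterable, Optional

TOOLS = ("diamond", "hmmer_dbcan", "hmmer_dbcan_sub")
CLASSES = {"AA", "CBM", "CE", "GH", "GT", "PL"}


def cazyme_class(family: str) -> Optional[str]:
    """Map a family name such as 'GH28' to its class prefix ('GH')."""
    match = re.match(r"[A-Z]+", family)
    prefix = match.group(0) if match else None
    return prefix if prefix in CLASSES else None


def count_supported(hits: Iterable[Dict[str, object]], min_tools: int = 2) -> Counter:
    """Count hits per CAZyme class, keeping only those supported by >= min_tools tools."""
    counts: Counter = Counter()
    for hit in hits:
        support = sum(1 for tool in TOOLS if hit.get(tool))
        cls = cazyme_class(str(hit["family"]))
        if support >= min_tools and cls is not None:
            counts[cls] += 1
    return counts


example_hits = [
    {"family": "GH28", "diamond": True, "hmmer_dbcan": True, "hmmer_dbcan_sub": True},
    {"family": "PL1", "diamond": True, "hmmer_dbcan": True, "hmmer_dbcan_sub": False},
    {"family": "CBM32", "diamond": False, "hmmer_dbcan": True, "hmmer_dbcan_sub": False},
]
print(count_supported(example_hits))  # the CBM32 hit is dropped (one tool only)
```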
CAZyme analysis using dbCAN3 predicted 111 domains encoding six major classes of CAZymes in the MAFF 302110T genome (Table S6). Among the putative CAZymes, GHs were the most abundant (47 domains) in the genome, followed by GTs (30), PLs (13), CBMs (11), CEs (8), and AAs (2). A number of putative CAZyme domains were also identified in the genomes of the closely related species, whose CAZyme profiles were quite similar to those of MAFF 302110T (Table S6). The large and diverse set of CAZyme domains identified in the genomes of the strains investigated suggests that these enzymes may play a crucial role in their pathogenic mechanisms and/or ecology. Among the secretion systems, the number of genes assigned to the core components of the Type II, Sec, and Tat systems differed only slightly among the strains (Table S7). The high conservation of these systems in the strains suggests that they might play a critical role in the pathogenicity of the strains, which is consistent with previous reports that a variety of CAZymes degrading plant cell walls are secreted via the Type II secretion system [43,44,47,48]. In contrast, for the Type III (excluding the flagellar secretion system), IV, and VI secretion systems, the number of genes assigned varied widely among the strains (Table S7). Type IV secretion system genes were detected in MAFF 302110T, P. parmentieri, and P. wasabiae, but no Type III genes were detected in these three strains. P. polonicum and P. punjabense showed the opposite pattern, with Type III secretion system genes detected, but no Type IV genes. For the Type VI secretion system, no relevant core genes were found at all in P. punjabense. Among the other four strains, the numbers of Type VI genes differed markedly. The diversity observed among the strains with regard to the Type III, IV, and VI secretion systems (Table S7) may have a significant impact on the pathogenicity, virulence, host range, and/or ecology of these strains. However, to clarify this point, not only the core components of the secretion systems, but also the effectors secreted via these systems need to be comprehensively analysed and compared among the strains.
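The Type III/Type IV pattern described above can be summarized by grouping strains according to their presence/absence profile. The sketch below encodes only what the text states, namely detection or non-detection of Type III and Type IV core genes. Gene counts from Table S7 are not reproduced, and the function name is hypothetical.

```python
# Sketch: group strains by presence/absence of Type III and Type IV secretion
# system core genes, as stated in the text. No gene counts are used.
from collections import defaultdict
from typing import Dict, List, Tuple

profiles: Dict[str, Dict[str, bool]] = {
    "MAFF 302110T":   {"T3SS": False, "T4SS": True},
    "P. parmentieri": {"T3SS": False, "T4SS": True},
    "P. wasabiae":    {"T3SS": False, "T4SS": True},
    "P. polonicum":   {"T3SS": True,  "T4SS": False},
    "P. punjabense":  {"T3SS": True,  "T4SS": False},
}


def group_by_profile(data: Dict[str, Dict[str, bool]],
                     systems: Tuple[str, ...] = ("T3SS", "T4SS")
                     ) -> Dict[Tuple[bool, ...], List[str]]:
    """Return strains grouped by their presence/absence tuple for the given systems."""
    groups: Dict[Tuple[bool, ...], List[str]] = defaultdict(list)
    for strain, present in data.items():
        key = tuple(present[s] for s in systems)
        groups[key].append(strain)
    return dict(groups)


for key, strains in group_by_profile(profiles).items():
    label = ", ".join(f"{s}={'+' if p else '-'}" for s, p in zip(("T3SS", "T4SS"), key))
    print(f"{label}: {', '.join(strains)}")
```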

TAXONOMIC CONCLUSION
Phylogenetic analyses of 16S rRNA and gyrB gene sequences (Figs S2 and S3) and examination of the cellular fatty acid composition (Table 3) and G+C content (Table 2), as well as the preliminary genome analyses carried out at TYGS and DFAST (Tables S2 and S3), consistently suggested the affiliation of MAFF 302110T/MAFF 302107 with the genus Pectobacterium. Phylogenomic analysis using whole-genome sequences revealed that the phylogenetic placement of the strains did not align with any known species within the genus (Fig. 1). The results of the ANIb, OrthoANIu, and dDDH analyses (Table 1) confirmed that MAFF 302110T/MAFF 302107 represent a novel species of the genus Pectobacterium, for which we propose the name Pectobacterium araliae sp. nov., designating MAFF 302110T (=ICMP 25161T) as the type strain. Distinguishing features of Pectobacterium araliae sp. nov. from its closest relatives were evident in its phenotypic characteristics (Table 2), cellular fatty acid composition (Table 3), and repertoire of secretion systems (Table S7).
Members of this species are Gram-reaction-negative, facultatively anaerobic, motile with peritrichous flagella, rod-shaped, and non-spore-forming. The cell size (mean±standard deviation) is 2.1±0.5×0.9±0.08 µm (n=25). On TY agar plates, grows well at 28 °C, forming visible colonies within 24 h, but the growth is weak at 37 °C. Colonies are pale yellow to greyish-white in colour, opaque, round with entire margins, raised, smooth, and glistening, with a diameter of approximately 1–2 mm, after culturing on TY agar plates for 48 h at 28 °C. No diffusible pigments are observed when cultured on the plates. The strains of the species are positive for potato soft rot and growth at pH 6, but negative for oxidase activity and growth at pH 5. Catalase activity differs between the strains.
The type strain, MAFF 302110T (=ICMP 25161T), was isolated from a lesion formed on a Japanese angelica tree (Aralia elata) affected by bacterial soft rot, which was sampled in Yamanashi Prefecture, Japan, in 1987, and is pathogenic to Japanese angelica tree. MAFF 302107 (=ICMP 25162) is an additional strain of this species.

Fig. 1. Phylogenomic tree reconstructed based on the concatenated alignment of 807 core genes (total alignment length of 639 079 bp), showing the relationships between Pectobacterium araliae sp. nov. strains (boldface type) and the closely related species listed in Table 1. The concatenated alignment was generated using Roary [34] and Gblocks [36]. The maximum-likelihood tree was inferred using RAxML-NG [37] with a general time-reversible substitution model and a gamma model of rate heterogeneity. Budvicia aquatica DSM 5075T served as an outgroup. GenBank accession numbers are shown in parentheses. The numbers at the nodes indicate standard bootstrap values from 100 replications.

Table 3. Cellular fatty acid composition (as percentages of the total) of strain MAFF 302110T and the closely related Pectobacterium species
Strains: 1, MAFF 302110T; 2, Pectobacterium polonicum DPMP315T; 3, Pectobacterium wasabiae CFBP 3304T. Data for strain MAFF 302110T were obtained in this study. The other data were from Waleron et al. [5], under the same culture and analytical conditions used in this study. -, Not detected. Fatty acids representing <1 % in all strains are not shown. The major fatty acids of each species (>5 % of the total fatty acids) are highlighted in bold.
*Summed feature 2 comprises any combination of C12:0 aldehyde, an unknown fatty acid with an equivalent chain length of 10.928, C16:1 iso I, and C14:0 3OH.
†Summed feature 3: C16:1 ω7c and/or C16:1 ω6c.
‡Summed feature 8: C18:1 ω7c and/or C18:1 ω6c.